Finite-size effects in on-line learning of multilayer neural networks
We complement recent advances in thermodynamic-limit analyses of mean on-line gradient descent learning dynamics in multilayer networks by calculating the fluctuations exhibited by finite-dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation, as student hidden-unit weight vectors begin to imitate specific teacher vectors, and they increase with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Speaker diarization using gesture and speech
We demonstrate how the problem of speaker diarization can be solved using both gesture and speaker parametric models. The novelty of our solution is that we approach the speaker diarization problem as a speaker recognition problem after learning speaker models from speech samples corresponding to gestures (the occurrence of gestures indicates the presence of speech, and the location of gestures indicates the identity of the speaker). This new approach offers several advantages: performance comparable to the state of the art, faster computation and more adaptability. In our implementation, parametric models are used to model speakers' voices and their gestures: more specifically, Gaussian mixture models are used to model the voice characteristics of each person and of all persons, and gamma distributions are used to model gestural activity based on features extracted from Motion History Images. Tests on 4.24 hours of the AMI meeting data show that our solution improves the DER score by 19% on speech-only segments and by 4% on all segments including silence (the comparison is with the AMI system).
Investigation of topographical stability of the concave and convex Self-Organizing Map variant
We investigate, by a systematic numerical study, the parameter dependence of
the stability of the Kohonen Self-Organizing Map and the Zheng and Greenleaf
concave and convex learning with respect to different input distributions
and different input and output dimensions.
Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models
Shapley values underlie one of the most popular model-agnostic methods within
explainable artificial intelligence. These values are designed to attribute the
difference between a model's prediction and an average baseline to the
different features used as input to the model. Being based on solid
game-theoretic principles, Shapley values uniquely satisfy several desirable
properties, which is why they are increasingly used to explain the predictions
of possibly complex and highly non-linear machine learning models. Shapley
values are well calibrated to a user's intuition when features are independent,
but may lead to undesirable, counterintuitive explanations when the
independence assumption is violated.
In this paper, we propose a novel framework for computing Shapley values that
generalizes recent work that aims to circumvent the independence assumption. By
employing Pearl's do-calculus, we show how these 'causal' Shapley values can be
derived for general causal graphs without sacrificing any of their desirable
properties. Moreover, causal Shapley values enable us to separate the
contribution of direct and indirect effects. We provide a practical
implementation for computing causal Shapley values based on causal chain graphs
when only partial information is available and illustrate their utility on a
real-world example.
Comment: Accepted at the 34th Conference on Neural Information Processing Systems
(NeurIPS 2020).
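To make the attribution idea in the abstract above concrete, here is a minimal sketch of exact Shapley values for a toy model. It is not the causal variant the abstract proposes: features outside a coalition are simply replaced by a fixed baseline value, which corresponds to the independence-style setting the abstract describes as the starting point. The model, the input `x`, the `baseline` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function value_fn(frozenset)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of a coalition of this size that excludes feature i.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with one additive term and one interaction term.
def model(x):
    return 2.0 * x[0] + x[1] * x[2]


x = [1.0, 2.0, 3.0]          # instance to explain (assumed)
baseline = [0.0, 0.0, 0.0]   # reference input (assumed)


def value_fn(coalition):
    # Features in the coalition keep their observed value;
    # all others are set to the baseline (no causal adjustment).
    z = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(z)


phi = shapley_values(value_fn, len(x))
print(phi)
```

The attributions sum to the difference between the model's prediction at `x` and at the baseline, which is the property the abstract refers to. A causal version would replace `value_fn` with an expectation under an intervention on the coalition's features, which this sketch does not attempt.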
Unsupervised feature learning for visual sign language identification
Prior research on language identification has focused primarily on text and speech. In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. The method first learns features from unlabelled video data (unsupervised feature learning) and then uses these features to train a classifier that discriminates between six sign languages (supervised learning). We ran experiments on video samples involving 30 signers (running for a total of 6 hours). Using leave-one-signer-out cross-validation, our evaluation on short video samples shows an average best accuracy of 84%. Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools, and our results indicate that this is realistic for sign language identification.
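The leave-one-signer-out protocol mentioned in the abstract above can be sketched as follows. This shows only the splitting logic, under the assumption that each sample carries a signer label; the labels and the helper name `leave_one_group_out` are hypothetical, and the feature learning and classification steps are not shown.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label."""
    for held_out in sorted(set(groups)):
        # All samples from the held-out signer form the test fold.
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        # Samples from every other signer form the training fold.
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical signer label for each video sample.
signers = ["s1", "s1", "s2", "s3", "s3", "s2"]

folds = list(leave_one_group_out(signers))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Because no signer appears in both folds of a split, the reported accuracy reflects performance on people the classifier has not seen during training.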
Dynamical and Stationary Properties of On-line Learning from Finite Training Sets
The dynamical and stationary properties of on-line learning from finite
training sets are analysed using the cavity method. For large input dimensions,
we derive equations for the macroscopic parameters, namely, the student-teacher
correlation, the student-student autocorrelation and the learning force
fluctuation. This enables us to provide analytical solutions to Adaline learning
as a benchmark. Theoretical predictions of training errors in transient and
stationary states are obtained by a Monte Carlo sampling procedure.
Generalization and training errors are found to agree with simulations. The
physical origin of the critical learning rate is presented. Comparison with
batch learning is discussed throughout the paper.
Comment: 30 pages, 4 figures
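As an illustration of the setting in the abstract above, the following is a minimal simulation sketch of on-line Adaline (least-mean-squares) learning in which examples are redrawn at random from a fixed, finite training set. It does not reproduce the cavity-method analysis. The dimensions, the learning rate `eta`, the `1/n` scaling of the update, the noiseless linear teacher and all names are assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def training_error(w, X, y):
    """Mean of half the squared error over the fixed training set."""
    return sum((yk - dot(w, xk)) ** 2 for xk, yk in zip(X, y)) / (2 * len(X))


def adaline_online(X, y, eta=0.5, steps=5000, seed=0):
    """On-line LMS updates, sampling with replacement from the finite set."""
    rng = random.Random(seed)
    n = len(X[0])
    w = [0.0] * n
    for _ in range(steps):
        k = rng.randrange(len(X))        # recycle an example from the set
        err = y[k] - dot(w, X[k])        # Adaline error on that example
        w = [wi + eta * err * xi / n for wi, xi in zip(w, X[k])]
    return w


# Assumed toy problem: n inputs, p examples, noiseless linear teacher.
data_rng = random.Random(1)
n, p = 20, 40
teacher = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
X = [[data_rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [dot(teacher, xk) for xk in X]

initial_error = training_error([0.0] * n, X, y)
w = adaline_online(X, y)
final_error = training_error(w, X, y)
print(initial_error, final_error)
```

Varying `eta` in such a simulation is one way to observe empirically that learning becomes unstable when the rate is too large, which is the kind of behaviour the abstract's critical learning rate refers to.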